Recognition of Out-of-vocabulary Words and Their Semantic Category

Author

  • F Gallwitz

Abstract

In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary (OOV) words, even when the vocabulary size is very large. In this paper we present a new approach for the integration of OOV words into statistical language models. It is based on the fact that the context of an OOV word contains information on its semantic category. This allows us to predict the semantic category of the OOV word during the word recognition process. This information is useful for improved recognition of the neighboring words, and can also be used by postprocessing modules, e.g. in spoken dialog systems. Although we use a simple acoustic model for OOV words, we achieve a 6% reduction of word error rate on spontaneous speech data with about 5% out-of-vocabulary rate.
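The idea that the context of an OOV word hints at its semantic category can be illustrated with a small sketch. This is not the model from the paper: it assumes a plain bigram count model in which OOV words in the training text have already been replaced by category tokens, and the corpus, the category names (`<OOV_CITY>`, `<OOV_NAME>`), and both function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus. OOV words are assumed to be already replaced by
# category tokens; the category inventory is an illustrative assumption.
CORPUS = [
    "i want to go to <OOV_CITY>",
    "a train to <OOV_CITY> please",
    "my name is <OOV_NAME>",
    "this is <OOV_NAME> speaking",
    "i want to go to hamburg",
]


def train_bigrams(sentences):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


def predict_oov_category(counts, prev_word):
    """Return the OOV category seen most often after prev_word, or None."""
    followers = counts.get(prev_word, Counter())
    candidates = {
        token: n for token, n in followers.items() if token.startswith("<OOV_")
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    for word in ("to", "is", "want"):
        print(word, "->", predict_oov_category(model, word))
```

In this toy setup the left neighbor alone decides the category. A real recognizer would combine such language model scores with acoustic scores and would need smoothing for unseen contexts.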

Similar resources

Online Persian handwriting recognition using a language model and reduced writing rules for the user

The joined-up, cursive form of Persian words, the immense variety of its scripts, and the different shapes Persian letters take depending on their position in a word have made Persian handwriting recognition a serious challenge. The major obstacle of most recognition methods is their inattention to sentence context, which causes utilizing of a word with correct appea...

Word clustering effect on vocabulary learning of EFL learners: A case of semantic versus phonological clustering

The aim of this study is to determine the effect of word clustering method on vocabulary learning of Iranian EFL learners through a case of semantic versus phonological clustering. To this effect, 80 homogeneous students from four intermediate classes at an English institute in Torbat e Heydariyeh participated in this research. They were assigned to four groups according to semantic versus phon...

Lexical leverage: category knowledge boosts real-time novel word recognition in 2-year-olds.

Recent research suggests that infants tend to add words to their vocabulary that are semantically related to other known words, though it is not clear why this pattern emerges. In this paper, we explore whether infants leverage their existing vocabulary and semantic knowledge when interpreting novel label-object mappings in real time. We initially identified categorical domains for which indivi...

Semantic processing survey of spoken and written words in adolescents with cerebral palsy: Evidence from PALPA word-picture matching test

Objective: The present study aimed to assess and compare semantic processing of spoken and written words in adolescents with cerebral palsy and healthy adolescents. Method: The present study is quantitative in terms of type and experimental in terms of method. The examination group consisted of 30 adolescents with cerebral palsy, aged 10 to 15 years, selected by convenience sampling. All of ...

The Impact of Semantic Clustering on Iranian EFL Advanced Learners’ Vocabulary Retention

This study investigated the impact of semantic clustering on Iranian EFL learners’ vocabulary retention at advanced level. Participants were female learners randomly assigned to two groups of 15. Four instruments (TOEFL test; vocabulary pretest; immediate posttest, and delayed recall posttest) were used. The experimental group underwent semantic clustering vocabulary presentation in which the l...

Publication date: 1997